Section: New Results

Semantic graph exploration through interesting aggregates

RDF graphs can be large and complex; finding interesting information within them is challenging. One easy way for users to discover such graphs is to be shown interesting aggregates (in the form of two-dimensional graphs, i.e., bar charts), where interestingness is evaluated through statistical criteria. While well understood for relational data, such exploration raises multiple challenges for RDF: facts, dimensions and measures have to be identified (as opposed to being known beforehand); as there are more candidate aggregates, assessing their interestingness can be very costly; finally, ontologies bring novel specific challenges through the presence of implicit data, but also novel opportunities, enabling ontology-driven exploration starting from an aggregate initially proposed by the system.
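To make the notion of an "interesting aggregate" concrete, here is a minimal sketch in Python. It assumes that facts are already flattened into dictionaries and that interestingness is scored by the variance of the bar heights; both the data layout and the choice of variance are illustrative assumptions, and the names `aggregate` and `interestingness` are hypothetical rather than taken from any of the systems discussed here.

```python
from collections import defaultdict
from statistics import pvariance


def aggregate(facts, dimension, measure):
    """Group facts by one dimension and sum the measure (one bar per group)."""
    groups = defaultdict(float)
    for fact in facts:
        groups[fact[dimension]] += fact[measure]
    return dict(groups)


def interestingness(agg):
    """Score an aggregate by the population variance of its bar heights.

    Assumption: variance stands in for the statistical criterion;
    a flat bar chart scores 0, a very uneven one scores high.
    """
    values = list(agg.values())
    return pvariance(values) if len(values) > 1 else 0.0


# Hypothetical toy facts; in an RDF setting these would first have to be
# identified in the graph rather than given up front.
facts = [
    {"country": "FR", "year": 2018, "sales": 10.0},
    {"country": "FR", "year": 2019, "sales": 12.0},
    {"country": "DE", "year": 2018, "sales": 1.0},
    {"country": "DE", "year": 2019, "sales": 2.0},
]

by_country = aggregate(facts, "country", "sales")  # {'FR': 22.0, 'DE': 3.0}
by_year = aggregate(facts, "year", "sales")        # {2018: 11.0, 2019: 14.0}

print(interestingness(by_country))  # 90.25
print(interestingness(by_year))     # 2.25
```

Under this assumed criterion, the per-country chart would be ranked above the per-year chart, since its bars differ much more from one another.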

The Dagger system, which we had previously proposed (2017), pioneered this approach. However, it is quite inefficient, in particular because it needs to evaluate numerous, expensive aggregation queries.

In 2019, we built upon Dagger to develop more efficient and more expressive versions thereof. Thus:

  • In [22], we describe Dagger+, which builds upon Dagger and leverages sampling to speed up the evaluation of potentially interesting aggregates. We show that Dagger+ achieves very significant execution time reductions, while reaching results very close to those of the original, less efficient system.

  • Going beyond the expressive power of Dagger (that is, of the candidate aggregates it enumerates), we have developed and demonstrated [15] Spade, a generic, extensible framework, which we instantiated with: (i) novel methods for enumerating candidate measures and dimensions in the vast space of possibilities provided by an RDF graph; (ii) a set of aggregate interestingness functions; (iii) ontology-based interactive exploration; and (iv) efficient early-stop techniques for estimating the interestingness of an aggregate query. A multi-dimensional aggregate automatically identified by Spade appears in Figure 3.
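The two ideas mentioned in the list above, sampling and early stopping, can be illustrated with a small Python sketch. It is not the algorithm of any of the systems named here: it assumes facts stored as dictionaries, a per-group mean as the aggregate (so that the result does not scale with sample size), variance of the group means as the interestingness score, and a simple "stop when the estimate stabilizes" rule. The names `group_means`, `score` and `estimate_score`, as well as the `batch` and `tol` parameters, are hypothetical.

```python
import random
from collections import defaultdict
from statistics import pvariance


def group_means(facts, dimension, measure):
    """Average the measure per group; means do not grow with sample size."""
    sums, counts = defaultdict(float), defaultdict(int)
    for fact in facts:
        sums[fact[dimension]] += fact[measure]
        counts[fact[dimension]] += 1
    return {key: sums[key] / counts[key] for key in sums}


def score(agg):
    """Assumed interestingness: population variance of the group values."""
    values = list(agg.values())
    return pvariance(values) if len(values) > 1 else 0.0


def estimate_score(facts, dimension, measure, batch=200, tol=0.05, seed=0):
    """Estimate the score on a growing random sample.

    Stops early once two consecutive estimates differ by at most `tol`
    (relative). Returns (estimated score, number of facts examined).
    """
    if not facts:
        return 0.0, 0
    shuffled = list(facts)
    random.Random(seed).shuffle(shuffled)
    previous = None
    for end in range(batch, len(shuffled) + batch, batch):
        sample = shuffled[:end]
        current = score(group_means(sample, dimension, measure))
        if previous is not None and abs(current - previous) <= tol * max(abs(previous), 1e-12):
            return current, len(sample)  # early stop: estimate is stable
        previous = current
    return previous, len(shuffled)  # fell through: all facts were used


# Hypothetical synthetic data: three groups with clearly different means.
rng = random.Random(42)
centers = {"a": 1.0, "b": 5.0, "c": 9.0}
data = [
    {"group": g, "value": centers[g] + rng.uniform(-0.5, 0.5)}
    for g in (("a", "b", "c")[i % 3] for i in range(9000))
]

estimate, examined = estimate_score(data, "group", "value")
exact = score(group_means(data, "group", "value"))
print(f"estimate={estimate:.3f} exact={exact:.3f} examined={examined}/{len(data)}")
```

The trade-off this sketch exposes is the usual one: a looser `tol` or a smaller `batch` examines fewer facts but gives a rougher estimate, whereas a real system would more likely rely on statistical error bounds than on this ad hoc stability test.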

Figure 3. Interesting multi-dimensional aggregate automatically identified by Dagger.
IMG/Dagger-screenshot.jpg